Abstract

We introduce a class of extensive form games where players might not be able to foresee 
the possible consequences of their decisions and form a model of their opponents which 
they exploit to achieve a more profitable outcome. We improve upon existing models of 
games with limited foresight, endowing players with the ability of higher-order reasoning 
and proposing a novel solution concept to address intuitions coming from real game play. 
We analyse the resulting equilibria, devising an effective procedure to compute them. 


1 Introduction 


While game theory is a predominant paradigm in Artificial Intelligence, the tools it provides 
to analyse real game play still abstract away from many essential features. One of them is the 
fact that in a wide range of extensive games of perfect information (e.g., Chess), humans (and 
supercomputers) are generally not able to fully assess the consequence of their own decisions 
and need to resort to a judgment call before making a move. As acclaimed game theorist 
Ariel Rubinstein puts it, "modeling games with limited foresight remains a great challenge" 
and the game-theoretic frameworks developed thus far "fall short of capturing the spirit of
limited-foresight reasoning" (Rubinstein, 2004, p. 134).


On the contrary, the AI approach to game-playing builds upon the assumption that 
complex extensive games like Chess or Go are theoretically games of perfect information, but 
this is only marginally relevant for practical purposes, and the backwards induction solution 
is of little help in predicting how such games are actually played in practice - a point also 
raised in Joseph Halpern’s AAMAS 2011 invited talk "Beyond Nash Equilibrium: Solution 
Concepts for the 21st Century" (Halpern, 2008). Decisions are instead taken using heuristic
search (e.g., Monte-Carlo tree search) under various constraints, such as time or memory
(Russell and Wefald, 1991; Russell and Norvig, 2012).


The problem. Search methods are a framework to handle limited foresight and are 
widely used for decision-making in real game-play, but a game-theoretic analysis of their 
equilibrium behaviour is still missing. In particular, we lack the tools to analyse what will 
happen in complex extensive games of perfect information where players are not able to resort 
to backwards induction reasoning but only to possibly faulty and incomplete heuristics. What is
more, the enormous effort to construct players with "opponent modelling" in the AI community
(e.g., (Schadd et al., 2007; Donkers et al., 2001)) still lacks solid game-theoretic foundations.

Our contribution. We introduce games in which players might not be able to foresee
the consequence of their strategic decisions all the way up to the terminal nodes and evaluate 
intermediate nodes according to a concrete heuristic search method. On top of that they 
can reason about other players’ limited foresight and evaluation criteria: they are endowed 
with higher-order beliefs about what their opponents can perceive of the game and how they 
evaluate it, beliefs about what their opponents believe the others can see and how they evaluate
it, and so forth. To analyse these games, we propose a new solution concept which combines
higher-order reasoning about players’ limited foresight and evaluation criteria. The guiding
principle for players’ behaviour is that each of them chooses a strategy in the game she sees
that is a best response to the belief about what the other players can see and how they evaluate 
it. We show constructively (Algorithms 1-4) that this solution concept always exists (Theorem 
1) and is a strict generalization of other known ones, e.g., backwards induction. As we will 
observe, the unbounded chain of beliefs underlying our rationality constraints can be finitely 
represented and, rather surprisingly, effectively resolved (Proposition 13).

Related literature. In recent years an innovative tradition has emerged in game theory, 
aiming at capturing situations in which players are unaware of parts of the game they are 
playing and might even believe they are playing a different game from the real one. Halpern and
Rego (Halpern and Rego, 2006), for instance, study models of unawareness of elements of the
game played (e.g., other players). Yossi Feinberg (Feinberg, 2012) approaches similar problems
from a syntactic perspective. Simultaneously, the interplay between belief and awareness in
interactive situations is analysed in a series of papers by Heifetz, Meier and Schipper (Heifetz
et al., 2006; Heifetz et al., 2013a; Heifetz et al., 2013b).

It should be noted that even though all these frameworks allow one, in the abstract, to talk about
unawareness of some terminal histories in a game, none of them comes equipped with a
solution concept capturing limited foresight reasoning.


A framework that comes closest, perhaps, to this is Games with Short Sight (Grossi and
Turrini, 2012), a well-behaved collection of games with awareness, in which players of an
extensive game make choices without knowing the consequences of their actions and base
their decisions on a (possibly incorrect) evaluation of intermediate game positions.

Games with Short Sight (GSSs) have been studied in relation with a solution concept called 
sight-compatible backwards induction: as players might not be able to calculate all possible 
moves up to the terminal nodes, they play rationally in a local sense, executing moves that 
are backwards induction moves in their own sight, therefore safely assuming their opponents 
see as much of the game as they do. 

However, sight-compatible backwards induction precludes any sort of opponent modelling, 
as players are not allowed to have a non-trivial belief about what their opponents perceive. 
Thus, the tools developed in (Grossi and Turrini, 2012) to analyse GSSs only allow players
to play approximately or inaccurately; they do not allow players to exploit their opponents’
believed weaknesses. Besides, GSSs employ heuristics which are not grounded in practical
game-play: essentially, players come equipped with a preference relation over all histories of
the game.

We will avoid strong rationality requirements of this kind by introducing a significantly
higher level of complexity in players’ reasoning, notably their ability to form an "opponent
model", which, it turns out, still remains computationally manageable. Also, players’
preference relations will not be taken as given, but derived from concrete search methods.

An important research line in AI that has similarities with our approach is interactive
POMDPs (Gmytrasiewicz and Doshi, 2005), which are able to incorporate higher-order
epistemic notions in multi-agent decision making, with a focus on learning and value/policy
iteration. These graph-like models are generally highly complex: in fact, the whole approach
is known to suffer from severe complexity problems when it comes to equilibrium analysis,
and approximation methods have been devised to (partially) address them (Doshi and
Gmytrasiewicz, 2009; Sonu and Doshi, 2015). Instead, we present a full-blown game-theoretic
model of limited foresight that allows for higher-order epistemic notions and yet
keeps equilibrium computation within polynomial time.

Paper Structure. Section "Games with limited foresight" recalls useful formal notation
and definitions from the literature upon which we build and introduces the mathematical
structures we will be working on, Monte-Carlo Tree Games. Section "Rational beliefs and
limited foresight" studies the higher-order extension thereof, Epistemic Monte-Carlo Tree
Games. Specifically, we define a new solution concept which takes this higher-order
dimension into account, and we then show the existence of the new equilibria through an
efficient (P-TIME) algorithm. Section "Conclusions and potential developments" summarises
our findings and hints at new research avenues opening up in our framework.

2 Games with limited foresight 

We start out with the definition of extensive games, on top of which we build the models of 
limited foresight. 


Extensive Games. An extensive game form (Osborne and Rubinstein, 1994) is a tuple
$\mathcal{G} = (N, H, t, \Sigma_i, o)$,
where [1] $N$ is a finite non-empty set of players. [2] $H$ is a non-empty prefix-closed set of
sequences, called histories, drawn from a set $A$ of actions. A history $(a^k)_{k=1,\ldots,K} \in H$ is
called a terminal history if it is infinite or if there is no $a^{K+1}$ such that $(a^k)_{k=1,\ldots,K+1} \in H$. The
set of terminal histories is denoted $Z$. A history $h$ is instead called quasi-terminal if for each
$a \in A$, if $(h, a) \in H$, then $(h, a)$ is terminal. If $h \in H$ is a prefix (resp., strict prefix) of $h' \in H$
we write $h \unlhd h'$ (resp., $h \lhd h'$). With $A_h = \{a \in A \mid (h, a) \in H\}$ we denote the set of actions
following the history $h$. The restriction of $H' \subseteq H$ to $h \in H$, i.e., $\{(h, h') \in H \mid (h, h') \in H'\}$,
is denoted $H'|_h$. [3] $t : H \setminus Z \to N$ is a turn function, which assigns a player to each
non-terminal history, i.e., the player who moves at that history. [4] $\Sigma_i$ is a non-empty set
of strategies. A strategy of player $i$ is a function $\sigma_i : \{h \in H \setminus Z \mid t(h) = i\} \to A_h$, which
assigns an action in $A_h$ to each non-terminal history for which $t(h) = i$. [5] $o$ is the outcome
function. For each strategy profile $\sigma = (\sigma_i)_{i \in N}$, the outcome $o(\sigma)$ of $\sigma$ is the terminal
history that results when each player $i$ follows the precepts of $\sigma_i$.

An extensive game is a tuple $\mathcal{E} = (\mathcal{G}, (u_i)_{i \in N})$, where $\mathcal{G}$ is an extensive game form,
and $u_i : Z \to \mathbb{R}$ is a utility function for each player $i$, mapping terminal histories to reals.
We denote $\succeq_i \subseteq Z \times Z$ the induced total preorder over $Z$ and $BI(\mathcal{E})$ the set of backwards
induction histories of extensive game $\mathcal{E}$, computed with the standard procedure (Osborne and
Rubinstein, 1994, Proposition 99.2).
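As a rough illustration of the standard procedure, the following Python sketch computes one backwards induction history on a small finite tree. The encoding of histories as tuples of action labels, the dictionaries `children`, `turn` and `utility`, and the toy payoffs are all assumptions made for this example. Ties are broken by taking the first maximiser, so the sketch returns a single backwards induction history rather than the whole set.

```python
def backward_induction(history, children, turn, utility):
    """Return one backwards induction terminal history below `history`."""
    moves = children.get(history, [])
    if not moves:                          # terminal history: nothing to choose
        return history
    player = turn[history]
    best = None
    for action in moves:
        outcome = backward_induction(history + (action,), children, turn, utility)
        # Keep the outcome the mover strictly prefers (first maximiser wins ties).
        if best is None or utility[outcome][player] > utility[best][player]:
            best = outcome
    return best

# Hypothetical two-player game: player 0 moves at the root, player 1 afterwards.
children = {(): ["l", "r"], ("l",): ["a", "b"], ("r",): ["c", "d"]}
turn = {(): 0, ("l",): 1, ("r",): 1}
utility = {
    ("l", "a"): (3, 1), ("l", "b"): (0, 2),
    ("r", "c"): (2, 2), ("r", "d"): (1, 1),
}
bi_outcome = backward_induction((), children, turn, utility)
```

In this toy game player 1 would answer `l` with `b` and `r` with `c`, so player 0 moves `r`.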


Sight Functions and Forked Extensions. On top of the extensive game structure, each
player moving at a certain point in the game is endowed with a set of histories that he or
she can see from then on.

Consider an extensive game $\mathcal{E} = (\mathcal{G}, (u_i)_{i \in N})$. A (short) sight function for $\mathcal{E}$ (Grossi
and Turrini, 2012) is a function

$s : H \setminus Z \to 2^{H} \setminus \{\emptyset\}$

associating to each non-terminal history $h$ a finite non-empty and prefix-closed subset of all
the histories extending $h$, i.e., histories of the form $(h, h')$. We denote $H\lceil_h = s(h)$ the sight
restriction on $H$ induced by $s$ at $h$, i.e., the set of histories in player $t(h)$'s sight, and $Z\lceil_h$
their terminal ones. Intuitively, the sight function associates any choice point with those
histories that the player playing at that choice point actively explores.


In (Grossi and Turrini, 2012) the problem of evaluating intermediate positions is resolved
by assuming the existence of an arbitrary preference relation over these nodes, which is
common knowledge among the players. What we do instead is to introduce an extension
of sight functions that models the evaluation obtained by a concrete search procedure. The
idea is that in order to evaluate intermediate positions, each player carries out a selection
and a random exploration of their continuations, all the way up to the terminal nodes. The
information obtained is used as an estimate of the value of those positions. This is an encoding
of a basic Monte-Carlo Tree Search (Browne et al., 2012).

Let $(\mathcal{E}, s)$ be a tuple made by an extensive game $\mathcal{E}$ and a sight function $s$. Sight function
$s^*$ is called a forked extension of sight function $s$ if the following holds:

• $s(h) \subseteq s^*(h)$, i.e., the forked extension prolongs histories in the sight it extends;

• For $\lceil^*$ being the sight restriction calculated using $s^*$ as sight function, we have that: if
$g \in s^*(h) \setminus Z$ then there exists $g' \in s^*(h)$ such that $g \lhd g'$, i.e., $s^*(h)$ is made of histories
that go all the way up to the terminal nodes.

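A Monte-Carlo Tree Game (MTG) is a tuple $\mathcal{S} = (\mathcal{E}, s, s^*)$ where $\mathcal{E} = (\mathcal{G}, (u_i)_{i \in N})$
is an extensive game, $s$ a sight function for $\mathcal{E}$ and $s^*$ a forked extension of $s$. We denote
$\mathcal{S}\lceil_h = (\mathcal{G}\lceil_h, (u_i\lceil_h)_{i \in N})$ the sight restriction of $\mathcal{S}$ induced by $s$ at $h$, where $\mathcal{G}\lceil_h$ is the game
form $\mathcal{G}$ restricted to $H\lceil_h$ and the utility function $u\lceil_h : N \times Z\lceil_h \to \mathbb{R}$ is constructed as follows.
For each $i \in N$, $g \in Z\lceil_h$, we have:

$$u_i\lceil_h(g) = \operatorname{avg}_{z \in Z\lceil^*_h,\; g \unlhd z} u_i(z)$$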


So the utility function at terminal histories in a sight is computed by taking the average²
of the utilities of the terminal histories contained in its forked extension. Notice the following important point:
histories in the forked extension are truly treated as "random" explorations, with no rationality
assumptions whatsoever, in order to construct a preference relation over $Z\lceil_h$. Sight-restriction
is applied to players, turn function, strategies and outcome function in the obvious way.
Summing up, each structure $(\mathcal{G}\lceil_h, (u_i\lceil_h)_{i \in N})$ is an extensive game, intuitively the part of the
game that the player moving at $h$ is able to see, where the terminal histories are evaluated
with a Monte-Carlo heuristic.
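As a small worked instance of this construction (the histories and payoffs are hypothetical, chosen only for illustration), suppose that a history $g$ that is terminal in the sight at $h$ is extended, in the forked extension, by exactly two terminal histories $z_1$ and $z_2$ with $u_i(z_1) = 3$ and $u_i(z_2) = 0$. Then:

```latex
u_i\lceil_h(g)
  \;=\; \operatorname{avg}_{z \in \{z_1, z_2\}} u_i(z)
  \;=\; \frac{u_i(z_1) + u_i(z_2)}{2}
  \;=\; \frac{3 + 0}{2}
  \;=\; 1.5
```

Player $i$ therefore ranks $g$ as if it were worth 1.5, irrespective of which of $z_1$, $z_2$ rational play below $g$ would actually reach.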


¹ A further natural constraint on forked sight functions is that of monotonicity, i.e., players do not forget
what they have calculated in the past. Formally, $s^*$ is monotonic if, for each $h, h'$ such that $t(h) = t(h')$ and
$h \lhd h'$, we have that $s^*(h)|_{h'} \subseteq s^*(h')$. Albeit natural, this assumption is not needed to prove our results.

² Averaging has the sole purpose of simplifying notation and analysis, which carries over to any aggregator,
with or without lotteries. Besides, it comes along with a few desirable properties, notably the fact that forked
extensions never miss dominated continuations, i.e., moves that ensure a gain no matter what the opponents
do. For quantified restrictions on aggregators cf., for instance, (van Benthem et al., 2011).

The solution concept proposed in (Grossi and Turrini, 2012) to analyse GSSs is sight-compatible
backwards induction: a choice of strategy, one per player, that is consistent with
the subgame perfect equilibrium of each sight-restricted game. We can encode it as follows.


Definition 1 (Sight-compatible BI) Let $\mathcal{S}$ be a MTG. A strategy profile $\sigma$ is a sight-compatible
backwards induction if at each $h \in H \setminus Z$, there exists a terminal history $z \in Z\lceil_h$
such that $h\sigma(h) \unlhd z$ and $z \in BI(\mathcal{S}\lceil_h)$. The set of sight-compatible backwards induction
outcomes of $\mathcal{S}$ is denoted $SCBI(\mathcal{S}) \subseteq Z$.


Thus, a sight-compatible backwards induction is a strategy profile $\sigma$ that, at each history $h$,
recommends an action $a$ that is among the actions initiating a backwards induction history
within the sight of the player moving at $h$. This, notice, is different from the backwards
induction solution of the whole game, because players’ evaluation of intermediate nodes might
not be a correct assessment of the real outcomes of the game. Grossi and Turrini show that
the SCBI solution always exists, even in infinite games.
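To make the difference with full backwards induction concrete, here is a minimal Python sketch of sight-compatible play on a finite tree. Everything in it is an assumption made for illustration: histories are tuples of action labels, `sights[h]` encodes $s(h)$, `forked[h]` lists the terminal histories of the forked extension at $h$, every sight is assumed to contain at least one successor of its root, and ties are broken by taking the first maximiser.

```python
from statistics import mean

def is_prefix(g, h):
    """True if history g is a (weak) prefix of history h."""
    return h[:len(g)] == g

def bi_in_sight(h, sight, forked_terminals, turn, utility):
    """One backwards induction history of the sight-restricted game at h."""
    def value(g, player):
        # Estimate of g: average utility over explored terminal histories below g.
        return mean(utility[z][player] for z in forked_terminals if is_prefix(g, z))

    def solve(g):
        successors = sorted(x for x in sight if len(x) == len(g) + 1 and is_prefix(g, x))
        if not successors:                 # g is terminal within the sight
            return g
        player = turn[g]
        return max((solve(x) for x in successors), key=lambda t: value(t, player))

    return solve(h)

def scbi_play(sights, forked, turn, utility):
    """Compose, history by history, the first move of a BI history in the mover's sight."""
    h = ()
    while h in turn:                       # non-terminal histories are the keys of `turn`
        target = bi_in_sight(h, sights[h], forked[h], turn, utility)
        h = target[:len(h) + 1]            # play only the first move, then re-plan
    return h

# Hypothetical game: at the root, player 0 sees one move ahead only.
turn = {(): 0, ("l",): 1, ("r",): 1}
utility = {("l", "a"): (5, 1), ("l", "b"): (0, 2), ("r", "c"): (2, 2), ("r", "d"): (1, 1)}
sights = {
    (): {(), ("l",), ("r",)},
    ("l",): {("l",), ("l", "a"), ("l", "b")},
    ("r",): {("r",), ("r", "c"), ("r", "d")},
}
forked = {
    (): {("l", "a"), ("l", "b"), ("r", "c"), ("r", "d")},
    ("l",): {("l", "a"), ("l", "b")},
    ("r",): {("r", "c"), ("r", "d")},
}
outcome = scbi_play(sights, forked, turn, utility)
```

In this toy instance player 0 estimates `l` at 2.5 and `r` at 1.5 and moves `l`, after which player 1 chooses `b`. Player 0 ends up with payoff 0, whereas backwards induction on the whole tree would have led to `("r", "c")` and payoff 2.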

Despite their effort in modelling more procedural aspects of game play, though, GSSs still 
lack non-trivial opponent modelling, i.e., players allowing for their opponents to “miss” future 
game developments and evaluate game positions differently (or any higher-order iteration of 
this belief), while adjusting their behaviour accordingly. 

The rest of the paper is devoted to extending MTGs with more realistic but considerably more
complex reasoning patterns, generalising both GSSs and SCBI. This, it turns out, does not
prevent us from having appropriate well-behaved solution concepts which generalise classical
ones, such as backwards induction.


3 Rational beliefs and limited foresight 

We now introduce an extension of MTGs in which players are capable of
higher-order opponent modelling, i.e., they have an explicit belief about what other players can
see and how they evaluate it, a belief about what other players believe other players can
see and how they evaluate it, and so forth, compatibly with players’ sight. We study a solution
concept for these games and relate it to known ones from the literature.

3.1 Players’ sights and belief chains 

Let us introduce the idea behind higher-order opponent modelling in MTGs using an example. 
We will then move on to define the notions formally. 

Example 2 (An intuitive solution) Consider the game shown in Figure 1. Three players,
Ann, Bob and Charles, move at histories marked $A$, $B$ and $C$, respectively. The circle
surrounding history $A$ indicates what Ann believes she can see from history $A$, which we write
$b(A)$. This, intuitively, coincides with what Ann can actually see, i.e., it equals $s(A)$, Ann’s sight
at history $A$. What should Ann do in this situation? This depends on what Ann believes will
happen next. If Ann knew this, her choice would only be a maximization problem: finding the
action that, given what will happen in the future, gets her the maximal outcome, according to
her evaluation from $A$, which we write $\succeq^{b(A)}_{Ann}$. To find out what Charles will do, Ann considers
her belief about what Charles can see from $C$, which we indicate with $b(A)b(C)$. Note this
may have nothing to do with what Charles actually sees from $C$, i.e., $b(C)$. In Figure 1, for
instance, Ann believes that Charles can only see $d$ from $C$. The question of what Charles will
do is then easily answered, even without considering his preference relation $\succeq^{b(A)b(C)}_{Charles}$, i.e., what Ann
believes Charles wants from history $C$. Charles, according to Ann, will certainly go to $d$. The
next question is: what will Bob do? This, again, will depend on $b(A)b(B)$, the portion of
Ann’s sight that Ann believes Bob can see from $B$, and on the preferences $\succeq^{b(A)b(B)}_{Bob}$ Ann believes
Bob has at $B$. But, at least according to Ann, Bob can also see that Charles can make moves.
So, for Bob to decide what to do, he must first find out what Charles will do at $b(A)b(B)b(C)$,
according to $\succeq^{b(A)b(B)b(C)}_{Charles}$. This is also an easy task, since $e$ is the only option. The choice at
$b(A)b(B)b(C)$ is then determined, but so is then the choice at $b(A)b(B)$. Now all that is left
for Ann to do is to solve her maximization problem, determining the choice at $b(A)$.

Figure 1: A belief structure (modulo evaluations)

Now we concentrate on turning the intuitions in the example into formal definitions. To
do so, we introduce the notion of history-sequence. A history-sequence is a formal device that
allows us to represent higher-order beliefs about other opponents, consistently with a player’s
sight.

Definition 3 (History-Sequences) Consider a MTG $\mathcal{S} = (((N, H, t, \Sigma_i, o), (u_i)_{i \in N}), s, s^*)$.
A history-sequence $q$ of $\mathcal{S}$ is a sequence of histories of the form $(h_0, h_1, h_2, \ldots, h_k)$ such that

• $h_j \in H\lceil_{h_0}$ for every $j \in \{1, 2, \ldots, k\}$, i.e., histories following $h_0$ in the sequence are
histories within the sight of the player moving at $h_0$;

• $h_j \lhd h_{j+1}$ for each $j$ with $0 \le j < k$, i.e., each history is a strict extension of the ones with
lower index.

The underlying idea behind this definition is to consider the higher-order point of view
of the player moving at $h_0$. Expressions of the form $(h_0, h_1, h_2, \ldots, h_k)$ encode the belief
that the player moving at $h_0$ holds about the belief that the player moving at $h_1$ holds about the
belief that the player moving at $h_2$ holds ... about what the player moving at $h_k$ can see and
what the evaluation is of the corresponding terminal histories. We use $Q$ to denote the set of
history-sequences of $\mathcal{S}$.
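The definition can be sketched in Python as a simple enumeration. The encoding below is an assumption: histories are tuples of action labels, and the tree is a guess at the shape of Figure 1 (Ann at the root choosing between `g` and the move to Bob, Bob choosing between `f` and the move to Charles, Charles choosing between `d` and `e`). The sketch additionally restricts later entries to non-terminal histories, since those are the ones at which some player has a belief to form.

```python
def strict_prefix(g, h):
    """True if history g is a strict prefix of history h."""
    return len(g) < len(h) and h[:len(g)] == g

def history_sequences(h0, sight, non_terminal):
    """Enumerate history-sequences (h0, h1, ..., hk) whose later entries are
    non-terminal histories in the sight at h0, each strictly extending the previous one."""
    found = []

    def extend(seq):
        found.append(seq)
        for h in sorted(sight):
            if h in non_terminal and strict_prefix(seq[-1], h):
                extend(seq + (h,))

    extend((h0,))
    return found

# Assumed encoding of the tree of Figure 1.
A, B, C = (), ("B",), ("B", "C")
sight_at_A = {A, B, C, ("g",), ("B", "f"), ("B", "C", "d"), ("B", "C", "e")}
sequences = history_sequences(A, sight_at_A, non_terminal={A, B, C})
```

With this encoding the enumeration yields four sequences, one for Ann's own view and one for each chain of beliefs about Bob and Charles.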

Building upon the notion of history-sequence, we can define what we call sight-compatible 
belief structures, associating each history-sequence with a set of histories and an evaluation 
over the terminal ones in this set. 

Definition 4 (Sight-compatible belief structures) Let $\mathcal{S}$ be a MTG. A sight-compatible belief
structure $B$ for $\mathcal{S}$ is a tuple $(B_H, B_P)$ such that $B_H$ is a function $B_H : Q \to 2^H$, associating
to each history-sequence $(h_0, h_1, h_2, \ldots, h_k)$ a set of histories in $s(h_0)$ extending $h_k$, and $B_P$
is a function $B_P : Q \to 2^Z$ associating to each history-sequence $q$ a set of terminal histories
extending histories in $B_H(q)$. $B$ satisfies the following conditions:

• (Corr) $\forall q \in Q$ with $q = (h_0)$, $B_H(q) = H\lceil_{h_0}$, i.e., the belief of a player
about what he himself can see is correct.³

• (Mon of $B_H$) $\forall q, q' \in Q$, if $\exists h' \in B_H(q)$ s.t. $q' = (q, h')$, then $B_H(q') \subseteq B_H(q)|_{h'}$,
i.e., if a player believes someone is able to perceive a portion of the game, then he is
able to perceive that portion himself.

• (Mon of $B_P$) $\forall q, q' \in Q$, if $\exists h' \in B_H(q)$ s.t. $q' = (q, h')$, then $B_P(q') \subseteq B_P(q)|_{h'}$,
i.e., if a player believes someone is able to explore a position, then he is able to perceive
that exploration himself.

For $q = (h_0, h_1, h_2, \ldots, h_k)$, $B_P(q)$ denotes the higher-order beliefs (in the order given
by $q$) about how the player moving at $h_k$ is evaluating the terminal histories in $B_H(q)$ under the
unique forked extension of $s$ whose terminal histories are $B_P(q)$. $\succeq^{B_P(q)}$ denotes the induced
preference relation, one per player.

The conditions above, we argue, are most natural constraints on sight-compatible higher-order
beliefs. For the time being we do not commit ourselves to any other constraints on either
$B_H$ or $B_P$, but we acknowledge that different contexts may warrant further constraints on
both.
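The first two conditions lend themselves to a mechanical check. The Python sketch below tests (Corr) and (Mon of $B_H$) on a finite table of believed sights. The tuple encoding of histories, the function name `is_sight_compatible` and the example table (a guess at the belief sets attached to Figure 1) are assumptions made for illustration; (Mon of $B_P$) could be checked in the same way on a second table.

```python
def extends(h, g):
    """True if history g (weakly) extends history h."""
    return g[:len(h)] == h

def is_sight_compatible(B_H, sight):
    """Check (Corr) and (Mon of B_H) for every history-sequence listed in B_H."""
    for q, believed in B_H.items():
        if len(q) == 1:
            if believed != sight[q[0]]:              # (Corr): own sight is believed correctly
                return False
        else:
            parent, last = q[:-1], q[-1]
            if last in B_H[parent]:                  # antecedent of (Mon of B_H)
                allowed = {g for g in B_H[parent] if extends(last, g)}
                if not believed <= allowed:          # believed sight must lie inside the parent's
                    return False
    return True

# Assumed encoding of Figure 1 and of the belief sets associated with it.
A, B, C = (), ("B",), ("B", "C")
g, f, d, e = ("g",), ("B", "f"), ("B", "C", "d"), ("B", "C", "e")
sight = {A: {A, B, C, g, f, d, e}}
B_H = {(A,): {A, B, C, g, f, d, e}, (A, B): {C, e, f}, (A, C): {d}, (A, B, C): {e}}
B_H_bad = dict(B_H)
B_H_bad[(A, B, C)] = {d}      # Ann would believe Bob credits Charles with seeing d, which Bob cannot see
```

Here `is_sight_compatible(B_H, sight)` holds, while the modified table violates monotonicity because `d` is not among the histories Ann believes Bob can see below `C`.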

Definition 5 (Epistemic Monte-Carlo Tree Games) An Epistemic Monte-Carlo Tree
Game (EMTG) is a tuple $\mathbf{S} = (\mathcal{S}, B)$ where $\mathcal{S}$ is a MTG and $B$ a sight-compatible belief
structure for $\mathcal{S}$.

An EMTG is obtained by assigning a sight-compatible belief structure to a MTG.
One should observe how sight-compatible belief structures induce, at each history, a whole
collection of extensive games, one for each possible history-sequence: for instance, the one
resulting from Ann’s sight and her evaluation, the one resulting from Ann’s belief about Bob’s
sight and his evaluation, and so forth. Structures of the form $\mathcal{S}\lceil_{B(q)}$ can now be naturally
defined, as restrictions induced by $B(q)$ on $\mathcal{S}$, adopting $B_H(q)$ as sight-restriction, and $B_P(q)$
as evaluation function, with the induced preference relation.

³ One might want to impose stronger variants of correctness. For instance, the fact that if a player can see
he will be moving again, then he will consider at least as much as he is considering now, from that history on:
$\forall q \in Q$, if $q = (h_0, h_1, \ldots, h_k)$, $q' = (h_1, \ldots, h_k)$ and $t(h_0) = t(h_1)$ then $B_H(q') = B_H(q)|_{h_0}$.

We can also impose that this fact is common knowledge among the players: $\forall q \in Q$, if
$q = (h_0, h_1, \ldots, h_{i-1}, h_i, h_{i+1}, \ldots, h_k)$, $q' = (h_0, h_1, \ldots, h_{i-1}, h_{i+1}, \ldots, h_k)$ and $t(h_i) = t(h_{i-1})$ then $B_H(q') = B_H(q)|_{h_i}$.

3.2 Analysing EMTGs

Example 2 has illustrated a natural notion of solution in an epistemic MTG, where each
player calculates a best action in his or her sight restriction according to his or her evaluation
criteria, recursively computing both the sight and the evaluation criteria of the other players.
This is the idea behind the solution concept we propose for EMTGs.


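Definition 6 (Nested Beliefs Solution) Let $\mathbf{S} = (\mathcal{S}, B)$ be an EMTG and let
$q = (h_0, \ldots, h_k)$ be a history-sequence. A strategy profile $\sigma\lceil_{B(q)}$ is a Nested Beliefs Solution
(NBS) of $\mathcal{S}\lceil_{B(q)}$ if:

Base step. For each $h' \in H\lceil_{B_H(q)}$ that is a quasi-terminal history of $H\lceil_{B_H(q)}$ we have that
$h'\sigma\lceil_{B_H(q,h')}(h') \succeq^{B_P(q,h')}_{t(h')} h'\sigma'\lceil_{B_H(q,h')}(h')$ for each $\sigma'\lceil_{B_H(q,h')}$ that agrees with $\sigma\lceil_{B_H(q,h')}$
up to $h'$.

Induction step. For each $h' \in H\lceil_{B_H(q)}$ that is neither terminal nor quasi-terminal in
$H\lceil_{B_H(q)}$, we have that

• $\sigma\lceil_{B(q)}(h')$ agrees at $h'$ with some Nested Beliefs Solution of $\mathcal{S}\lceil_{B(q,h')}$.

• If, for each $h' \in H\lceil_{B_H(q)}$ that is neither terminal nor quasi-terminal in $H\lceil_{B_H(q)}$,
we have that $\sigma'\lceil_{B(q)}(h')$ agrees at $h'$ with some Nested Beliefs Solution of $\mathcal{S}\lceil_{B(q,h')}$,
then the outcome $z'$ generated by $\sigma'\lceil_{B(q)}$ following $h_k$ and the outcome $z$ generated
by $\sigma\lceil_{B(q)}$ following $h_k$ are such that $z \succeq^{B_P(q)}_{t(h_k)} z'$.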


We denote $NBS(\mathcal{S}\lceil_{B(q)})$ the set of NBS outcomes of $(\mathcal{S}, q)$. The composition of such
outcomes yields our game solution.

Intuitively, a Nested Beliefs Solution of some game $\mathcal{S}\lceil_{B(q)}$ is a best response to all Nested
Beliefs Solutions at deeper levels, e.g., of each $\mathcal{S}\lceil_{B(q, h')}$. Notice that, because of the properties
of sight functions, the depth iteration is bound to reach a fixpoint.

Example 7 Let's go back to Figure 1 and compute the NBS at history $A$. We know there
are four relevant history-sequences: $(A)$, $(A, C)$, $(A, B)$, $(A, B, C)$. To each of them we can
associate the corresponding beliefs, as follows:

• $H\lceil_{B_H(A)} = \{g, d, e, f, C, B, A\} = H\lceil_{h_0}$

• $H\lceil_{B_H(A,B)} = \{e, C, f\}$

• $H\lceil_{B_H(A,C)} = \{d\}$

• $H\lceil_{B_H(A,B,C)} = \{e\}$

Let us now, for each history-sequence $q$, specify the preference relation $\succeq^{B_P(q)}$ (modulo
reflexivity and transitivity), which is all we need to compute NBS.

• $\succeq^{B_P(A,C)}_{Charles} = \emptyset$

• $\succeq^{B_P(A,B,C)}_{Charles} = \emptyset$

Consider now the following strategy $\sigma\lceil_{B(A)}$:

• $\sigma\lceil_{B(A)}(A)_{Ann} = g$

• $\sigma\lceil_{B(A)}(B)_{Bob} = C$

• $\sigma\lceil_{B(A)}(C)_{Charles} = d$

Is $\sigma$ a Nested Beliefs Solution of $\mathcal{S}\lceil_{B(A)}$?

The condition at the base step is met by $\sigma\lceil_{B(A)}(C)_{Charles} = d$.

Let's now look at $\sigma\lceil_{B(A)}(B)_{Bob} = C$. Is $C$ compatible with the best Nested Beliefs Solution
of $\mathcal{S}\lceil_{B(A,B)}$? We need first to compute all NBS of $\mathcal{S}\lceil_{B(A,B)}$. Luckily there are not so many.
Every such strategy must be of the form $\sigma'\lceil_{B(A,B)}(C)_{Charles} = e$ and be the best among the
strategies agreeing with NBS of $\mathcal{S}\lceil_{B(A,B,C)}$ at $C$. So, given the preferences of $B$, it must be such that
$\sigma'\lceil_{B(A,B)}(B)_{Bob} = C$. This is indeed what $\sigma$ does.

However, notice that given the preference of $A$, $\sigma$ is not behaving as a NBS at $A$, because
Ann prefers $d$ to $g$.

The strategy $\sigma^*\lceil_{B(A)}$ only disagreeing with $\sigma\lceil_{B(A)}$ at $(A)$, and being such that $\sigma^*\lceil_{B(A)}(A)_{Ann} = B$,
is a NBS of $\mathcal{S}\lceil_{B(A)}$.
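The recursion in this example can be replayed with a short Python sketch. It is not the paper's algorithm verbatim: the tuple encoding of the tree, the tables `B_H` and `VALUE`, and the function names `nbs_move` and `best_branch` are assumptions. The numeric values in `VALUE` are hypothetical; they are only chosen to agree with what the example states (Ann prefers `d` to `g`, and Bob, believing Charles would play `e`, prefers moving to Charles over `f`).

```python
# Assumed encoding of Figure 1: Ann at A, Bob at B, Charles at C.
A, B, C = (), ("B",), ("B", "C")
g, f, d, e = ("g",), ("B", "f"), ("B", "C", "d"), ("B", "C", "e")
TERMINAL = {g, f, d, e}

# Believed sights, one per history-sequence.
B_H = {(A,): {A, B, C, g, f, d, e}, (A, B): {C, e, f}, (A, C): {d}, (A, B, C): {e}}
# Hypothetical believed evaluations of the player moving at the last history of each sequence.
VALUE = {(A,): {g: 2, d: 3, e: 1, f: 0}, (A, B): {e: 2, f: 1}, (A, C): {d: 0}, (A, B, C): {e: 0}}

def best_branch(q, continuations):
    """Pick a best move at the last history of q, following believed continuations."""
    h_k, believed = q[-1], B_H.get(q, set())
    best_move, best_value = None, float("-inf")
    for h in sorted(believed):
        if len(h) != len(h_k) + 1 or h[:len(h_k)] != h_k:
            continue                       # h is not an immediate successor of h_k
        path = h
        while continuations.get(path) is not None:
            path = path + (continuations[path],)
        if VALUE[q][path] > best_value:    # compare by the believed evaluation at q
            best_move, best_value = h[-1], VALUE[q][path]
    return best_move

def nbs_move(q):
    """Believed move at the last history of q, computed from deeper beliefs first."""
    h_k = q[-1]
    if h_k in TERMINAL:
        return None
    continuations = {}
    for h in B_H.get(q, set()):
        if h != h_k:
            continuations[h] = nbs_move(q + (h,))
    return best_branch(q, continuations)

ann_move = nbs_move((A,))
```

Under these assumptions Ann expects Bob to move to Charles (because Bob is believed to expect `e`), expects Charles to play `d`, and therefore moves to Bob herself, which matches the strategy $\sigma^*$ of the example.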

The composition of Nested Beliefs Solutions constitutes a rational outcome of the game. 

Definition 8 (Sight-Compatible Epistemic Solution) Let $\mathbf{S} = (\mathcal{S}, B)$ be an EMTG. A
strategy profile $\sigma$ is a Sight-Compatible Epistemic Solution (SCES) if at each $h \in H \setminus Z$,
there exists a terminal history $z \in Z\lceil_{B_H(h)}$ such that $h\sigma(h) \unlhd z$ and $z \in NBS(\mathcal{S}\lceil_{B(h)})$.

We denote $SCES(\mathbf{S})$ the set of Sight-Compatible Epistemic Solutions of $\mathbf{S}$.

A SCES is the composition of best moves of players at each history. Each such move 
is a best response to what the current player believes other players will do and this belief 
is supported by all higher-order beliefs, compatible with the player’s sight, about what the 
opponents can perceive and how they will evaluate it. 

3.2.1 Computing rational solutions 

Algorithm 1, $Sol(\mathbf{S})$, below takes as input an EMTG and returns a path obtained by composing
locally rational moves, compatible with players’ higher-order beliefs about sights and
evaluation criteria of their opponents. Algorithm 1 calls Algorithm 2, which in turn calls
Algorithms 3 and 4. For technical convenience, we define VLP to be a dummy always-dominated
history.

The following theorem shows that every EMTG has a Sight-Compatible Epistemic 
Solution. Its proof consists in constructively building the desired strategy profile. 

Theorem 9 (Existence Theorem) Let $\mathbf{S} = (\mathcal{S}, B)$ be an EMTG. There exists a strategy
profile $\sigma$ that is a Sight-Compatible Epistemic Solution for $\mathbf{S}$.

⁴ Slightly abusing notation, but unambiguously, we identify actions chosen by the strategy with the resulting
histories.

Algorithm 1: Solution of $\mathbf{S}$

Sol($\mathbf{S}$)
Input: An EMTG $\mathbf{S} = (\mathcal{S}, B)$
Output: A terminal history $h$ of $\mathbf{S}$
begin
    $h \leftarrow \varepsilon$;
    while $h \notin Z$ do
        $h \leftarrow (h, BSBI(\mathbf{S}, h))$;    /* NBS at h */
    return $h$;


Algorithm 2: The current best move

BSBI($\mathbf{S}$, $h$)
Input: A game $\mathbf{S} = (\mathcal{S}, B)$, and a history $h$
Output: NBS move $a$ at $h$
begin
    for each $h' \in B_H(h)$ and $h' \neq h$ do
        Continuations[$h'$] $\leftarrow NBS(\mathbf{S}, (h, h'))$;    /* store NBS actions in an array, one for each h' */
    return $BB(\mathbf{S}, (h), \text{Continuations})$;


Proof Sketch: Let $H$ be the set of histories in $\mathbf{S}$. For every $h \in H$ set $\sigma(h) := a$ for $a \unlhd h^*$, with
$h^*$ being the outcome returned by Algorithm 1 on input $\mathcal{S}\lceil_{B(h)}$. That the Algorithm returns a
profile $o(\sigma)$ such that $\sigma$ satisfies the conditions of Definition 8 at each history is a lengthy but
relatively straightforward check, which we omit for space reasons. □

Theorem 10 (Completeness Theorem) Let $\mathbf{S} = (\mathcal{S}, B)$ be a finite EMTG and let $\sigma$ be a
Sight-Compatible Epistemic Solution for $\mathbf{S}$. There exists an execution of Algorithm 1
returning $o(\sigma)$.

Proof Sketch: Let $\mathbf{S} = (\mathcal{S}, B)$ be a finite EMTG and let $\sigma$ be a Sight-Compatible Epistemic
Solution for $\mathbf{S}$. Now choose an execution of Algorithm 1 that is compatible with the
action selection that, at each history sequence, is made by $\sigma$, which exists by construction.
The finiteness assumption ensures termination. □


The following observations illustrate the relation between SCES and the other two relevant
solution concepts in the literature: SCBI (Grossi and Turrini, 2012) and classical BI (Osborne
and Rubinstein, 1994). They specify precise conditions under which our solution concept
collapses into these two.


Proposition 11 Let $\mathbf{S} = (\mathcal{S}, B)$ be an EMTG. If for any history-sequence $q = (h_0, h_1, \ldots, h_k)$
and any history $h' \in B_H(q)$, $B_H(q, h') = B_H(q)|_{h'}$ and $B_P(q) = B_P(h_0)$, then

$SCES(\mathbf{S}) = SCBI(\mathcal{S})$.

So, if the current player believes the following players’ sights and evaluation criteria,
together with their beliefs about other players’ sights and evaluation criteria, are coherent
with his own, then SCES is equivalent to SCBI.

Algorithm 3: Beliefs of moves of following players

NBS($\mathbf{S}$, $q$)
Input: A game $\mathbf{S} = (\mathcal{S}, B)$, and a history sequence $q = (h_0, h_1, h_2, \ldots, h_k)$
Output: An action $a$ following $h_k$
begin
    if $h_k \in Z\lceil_{B_H(q)}$ then
        return $\varepsilon$;
    else
        for each $h_{k+1} \in B_H(q)$ and $h_{k+1} \neq h_k$ do
            Continuations[$h_{k+1}$] $\leftarrow NBS(\mathbf{S}, (q, h_{k+1}))$;    /* store NBS actions in an array, one for each h_{k+1} */
        return $BB(\mathbf{S}, q, \text{Continuations})$;


We know that the solution concept BI is a special case of SCBI, and therefore also of 
SCES. 

Proposition 12 Let $\mathbf{S} = (\mathcal{S}, B)$ be an EMTG. If, for any history-sequence $q = (h_0, h_1, \ldots, h_k)$,
we have that $B_H(q) = H|_{h_k}$ and that $B_P(q) = \succeq_{t(h_k)}$, then $SCES(\mathbf{S}) = BI(\mathcal{S})$.

The above result says that SCES coincides with the standard backwards induction solution
if, at each history, we have that higher-order beliefs about sight and evaluation criteria are
coherent with the real subgame the current player faces and the preference relation the current
player holds.

Despite the crucial presence of higher-order beliefs about sight-restricted games, we can 
show the following fairly surprising complexity result. 

Proposition 13 Given a finite EMTG $\mathbf{S}$, the problem of computing a SCES of $\mathbf{S}$ is P-TIME
complete.

Proof sketch: For the upper bound, the key fact is that algorithm $Sol(\mathbf{S})$ runs in time
$O((n \log n)^2)$, with $n$ being the cardinality of the set of histories of $\mathbf{S}$. This follows from the
equations and facts below, where $b$ and $d$ are the largest number of branches and the depth
of the game tree respectively:
1) $T(Sol(\mathbf{S})) = O(T(BSBI) \cdot d)$;
2) $T(NBS) = O(T(BSBI))$;
3) $T(BB) = O(b \cdot d)$;
4) let $f(d) = T(BSBI)$, then $f(d) = O(b \cdot f(d-1) + b^2 \cdot f(d-2) + b^3 \cdot f(d-3) + \cdots + b^{d-1} \cdot f(1) + T(BB)) = O(d \cdot 2^d \cdot b^d)$;
5) $d \le \log(n) \le b^d \le n$.
P-TIME hardness is a consequence of (Szymanik, 2013, Theorem 2), which shows that BI is P-TIME
hard, and Proposition 12.

As a side remark, using a similar argument and Proposition 11 we are able to show that
computing SCBI solutions is P-TIME complete.
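For readers who want to see how the facts in the proof sketch fit together, the chain of estimates can be spelled out as follows. This is only a sketch of the combination step: it takes facts 1, 4 and 5 as given and assumes the logarithm is taken in base 2, so that $2^{\log n} = n$.

```latex
\begin{align*}
T(Sol(\mathbf{S}))
  &= O\big(d \cdot T(BSBI)\big)
     && \text{(fact 1)} \\
  &= O\big(d \cdot d \cdot 2^{d} \cdot b^{d}\big)
     && \text{(fact 4)} \\
  &= O\big((\log n)^{2} \cdot 2^{\log n} \cdot n\big)
     && \text{(fact 5: } d \le \log n,\ b^{d} \le n\text{)} \\
  &= O\big((\log n)^{2} \cdot n \cdot n\big)
   \;=\; O\big((n \log n)^{2}\big).
\end{align*}
```

The bound is thus quadratic in $n \log n$, which is what places the computation in polynomial time.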


Algorithm 4: Best Branch

BB($\mathbf{S}$, $q$, Continuations)    /* Compose chosen moves (in array Continuations), thus get all paths following h_k and choose a best move following h_k */
Input: A game $\mathbf{S}$, a history sequence $q$, an array Continuations
Output: A best move following $h_k$ determined by Continuations
begin
    bestpath $\leftarrow$ VLP;    /* VLP is a dominated history for all players */
    for each $(h_k, a) \in B_H(q)$ do    /* a is any action following h_k; next we choose an optimal one in B_H(q) */
        TP $\leftarrow (h_k, a)$;
        while Continuations[TP] is defined in array Continuations do
            TP $\leftarrow$ (TP, Continuations[TP]);
        if TP $\succ_{t(h_k)}$ bestpath then
            bestpath $\leftarrow$ TP;
            bestmove $\leftarrow a$;
    return bestmove;


4 Conclusions and potential developments

We have proposed a model for decision-making among resource-bounded players in extensive
games, integrating an analytical perspective coming from game theory with a procedural perspective
coming from AI. In particular, we have studied players with limited foresight who can
reason about their opponents, constructing beliefs about their limited abilities for calculation
and evaluation, and we have shown that our novel games have a well-behaved solution, generalising
existing ones in the literature.

There are interesting modelling issues, as noted previously. Our game models strike a 
balance between simple trees as used for BI and more complex models as found in epistemic 
game theory (Perea, 2012). Here, what we left open is the relation between EMTGs and
the Extensive Games with Awareness of (Halpern and Rego, 2006). We expect that the
correspondence for GSSs of Theorem 3 in (Grossi and Turrini, 2012) can be lifted to EMTGs,
using an iteration of the awareness functions $Aw_i$ for players $i$ to simulate the believed game
at a history sequence. We stress, though, that the specific features of EMTGs give them 
an independent conceptual and technical interest. The emphasis on limited foresight (as 
opposed to perceiving a novel extensive game in (Halpern and Rego, 2006)) makes them a 
natural candidate for addressing Rubinstein’s modelling challenge (Rubinstein, 2004), while 
still supporting an efficient algorithm to calculate the game equilibria. 

Finally, our analysis raises several issues of logical definability and styles of reasoning. We
believe that our solution concept is still definable in a computationally well-behaved logical
language, a natural candidate being the fixed-point logic FOL(FP), shown in (van Benthem
and Gheerbrant, 2010) to express backwards induction. What is new in our setting is that
the reasoning underpinning our main theorems is a mixture of a backward induction style
with a forward induction style (Perea, 2012; van Benthem, 2014), since we have to evaluate
what players further down in the game tree are going to do according to players whose moves
occurred earlier on in the game.